Amplitude Modulation Maps for Robust Speech Recognition

Author

  • G. F. Meyer
Abstract

Two recognition tasks are discussed in which pre-processing based on amplitude modulation (AM) maps is compared with other feature extraction strategies. The first task shows how the AM map representation can be used to segregate voiced speech signals from one another; the second shows how it can be used for robust digit recognition in additive noise. In the first task, natural vowels from the TIMIT database are presented concurrently with a second vowel and recognised using a multilayer perceptron. AM map based pre-processing is compared with Parsons' harmonic selection algorithm and with a strategy using no noise reduction. The proposed feature extraction algorithm leads to an improvement in recognition equivalent to a 6 dB increase in signal-to-noise ratio (SNR) over the other algorithms. In the second task, digits (from OGI Alphadigits) were presented in clean conditions, in white noise, and in rapidly varying high-pass/low-pass noise. Recognition performance, based on an 8-state left-to-right hidden Markov model (HMM), is compared for conventional mel-scale cepstral coefficients (MFCCs), auditory filterbank output, and the spectra recovered from AM maps. For clean speech we obtain error rates of 6-8% for all three strategies, but as the noise level increases, recognition scores consistently show AM maps to be the most robust strategy.
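To make the "6 dB increase in SNR" concrete, the following is a minimal sketch of mixing a signal with additive white noise at a chosen SNR, using the standard definition SNR(dB) = 10 log10(P_signal / P_noise). It is not taken from the paper: the function names (`mix_at_snr`, `measure_snr_db`), the 200 Hz test tone, and the 8 kHz sampling rate are illustrative assumptions. Under this definition, a 6 dB gain corresponds to roughly a fourfold increase in the signal-to-noise power ratio.

```python
import math
import random


def power(samples):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)


def measure_snr_db(signal, noise):
    """SNR in dB from the mean powers of signal and noise."""
    return 10 * math.log10(power(signal) / power(noise))


def mix_at_snr(signal, noise, target_snr_db):
    """Scale `noise` so the mixture has the requested SNR, then add it.

    Returns the noisy mixture and the scaled noise (hypothetical helper).
    """
    ratio = 10 ** (target_snr_db / 10)  # desired power ratio
    gain = math.sqrt(power(signal) / (power(noise) * ratio))
    scaled = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(signal, scaled)]
    return mixed, scaled


# Assumed test material: a 200 Hz tone sampled at 8 kHz plus Gaussian noise.
rng = random.Random(0)
signal = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

mixed, scaled = mix_at_snr(signal, noise, 6.0)
measured = measure_snr_db(signal, scaled)
power_ratio_6db = 10 ** (6.0 / 10)  # just under 4
```

Here `measured` recovers the requested 6 dB up to floating-point error, and `power_ratio_6db` shows why a 6 dB improvement is often described as a quadrupling of the signal-to-noise power ratio.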

Similar resources

Robust energy demodulation based on continuous models with application to speech recognition

In this paper, we develop improved schemes for simultaneous speech interpolation and demodulation based on continuous-time models. This leads to robust algorithms to estimate the instantaneous amplitudes and frequencies of the speech resonances and extract novel acoustic features for ASR. The continuous-time models retain the excellent time resolution of the ESAs based on discrete energy operato...

Amplitude Modulation Filters as Feature Sets for Robust ASR: Constant Absolute or Relative Bandwidth?

Many research efforts in the field of feature extraction for automatic speech recognition focus on analyzing slow amplitude fluctuations of speech. In this study, the importance of spectral and temporal resolution for amplitude modulation frequency analysis is investigated in order to provide guidance for appropriate filter design. Therefore, different wavelet and Fourier transfor...

Smoothed Nonlinear Energy Operator-Based Amplitude Modulation Features for Robust Speech Recognition

In this paper we present a robust feature extractor that uses smoothed nonlinear energy operator (SNEO)-based amplitude modulation features for a large vocabulary continuous speech recognition (LVCSR) task. SNEO estimates the energy required to produce the AM-FM signal, and then the estimated energy is separated into its amplitude and frequency components using an energy separa...

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have shown their effectiveness in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...

Improving the performance of MFCC for Persian robust speech recognition

Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...

Publication date: 2003